{
 "metadata": {
  "name": "",
  "signature": "sha256:7bbd44a78f5201c0b3188f01df1f53a3d73bd16577d1d582c0707b1c15c46320"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# `pandas` 101\n",
      "\n",
      "- [official docs](http://pandas.pydata.org/pandas-docs/stable/index.html)\n",
      "\n",
      "- name from \"panel data\" in econometrics, created by Wes McKinney\n",
      "\n",
      "- designed for split-apply-combine and time-series manipulations, combines many behaviors of R's `plyr`, `reshape2`, ... packages into one\n",
      "\n",
      "- generally favor *immutability* (most operations will / should produce new objects)\n",
      "\n",
      "- fundamental data structures: Series, DataFrame"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## part 0: check your installs\n",
      "\n",
      "Already have `pandas` installed? Use the `script` [magic function](http://nbviewer.ipython.org/urls/raw.github.com/ipython/ipython/1.x/examples/notebooks/Cell%20Magics.ipynb) below to check. If it's not installed, get your `pip` on and install all the packages from my email."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%script bash\n",
      "whoami\n",
      "pip freeze | grep \"pandas\\|numpy\\|matplotlib\""
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# standard import naming conventions \n",
      "import numpy as np\n",
      "import pandas as pd\n",
      "import matplotlib.pyplot as plt\n",
      "%matplotlib inline"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "For some more background on the relationship between `matplotlib`, `pyplot`, `pylab`, check out [this link](http://bit.ly/197aGoq). The `inline` [magic function](http://ipython.org/ipython-doc/dev/interactive/tutorial.html#magic-functions) displays our plots in an Output notebook cell. "
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## part 1: data structures\n",
      "\n",
      "There are two\\* main structures in `pandas`: Series (1-dimensional labeled array) and DataFrame (2-dimensional labeled structure).\n",
      "\n",
      "\n",
      "\\* there is also a TimeSeries (a flavor of Series that contains datetimes), Panel (3-dimensional), and Panel4D (4-dimensional). The last two are 'less used,' according to the docs. I haven't experimented with them yet."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Series (1D)\n",
      "\n",
      "Series can hold any data type, and the axis label is called an index. Series is dict-like in that you can get and set values by index label. "
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "s1 = pd.Series([1,3,5,np.nan,6,8])\n",
      "s1"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# by default (without specifying them explicitly), the index label is just an int\n",
      "s1[5]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
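    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "To make the dict-like behavior concrete, here is a minimal sketch using a Series with explicit string labels. The name `s_lab` and its labels and values are made up for illustration."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import pandas as pd\n",
      "\n",
      "# hypothetical Series with explicit string labels instead of the default ints\n",
      "s_lab = pd.Series([10, 20, 30], index=['x', 'y', 'z'])\n",
      "\n",
      "s_lab['y'] = 25    # set an existing entry by label\n",
      "s_lab['w'] = 40    # assigning to a new label adds an entry\n",
      "\n",
      "s_lab['y'], s_lab['w']   # get by label, like a dict lookup"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },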
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### DataFrame (2D)\n",
      "\n",
      "Columns can be of different data types. Index and column names are optional. If individual Series have different indexes, the DataFrame index will be the union of the individual ones.\n",
      "\n",
      "Can create from:\n",
      "\n",
      "- dict of 1D ndarrays, lists, dicts, or Series\n",
      "\n",
      "- 2-D numpy.ndarray\n",
      "\n",
      "- Series\n",
      "\n",
      "- another DataFrame\n",
      "\n",
      "\n",
      "N.B.: there are other helper methods for constructing DataFrames from varying data types; [see the docs](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#alternate-constructors) for more options."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# create a couple more Series\n",
      "s2, s3 = pd.Series(np.random.randn(6)), pd.Series(np.random.randn(6))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# combine multiple Series into a DataFrame with column labels\n",
      "df_1 = pd.DataFrame({'A': s1, 'B': s2, 'C': s3})\n",
      "\n",
      "df_1"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# when Series are different lengths, DataFrame fills in gaps with NaN\n",
      "s4 = pd.Series(np.random.randn(8))  # whoaaaaaa this Series has extra entries!\n",
      "\n",
      "df1 = pd.DataFrame({'A': s1, 'B': s2, 'C': s3, 'D': s4})\n",
      "\n",
      "df1 "
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# create a DataFrame from numpy array\n",
      "df2 = pd.DataFrame(np.random.randn(6,4))\n",
      "\n",
      "df2             # can only have one 'pretty' output per cell (if it's the last command)\n",
      "\n",
      "#print df2       # otherwise, can print arb number of results w/o pretty format\n",
      "#print df1       # (uncomment both of these print statements)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Can inspect your DataFrames with `head()` and `tail()` methods - takes a number of lines as an argument. \n",
      "\n",
      "Without specifiying them, DataFrames have default index and column name attributes."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# recall current dataframe \n",
      "df2.head(2)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "But you can assign to those attributes of the DataFrame..."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "cols = ['a', 'b', 'c', 'd']\n",
      "\n",
      "# assign columns attribute (names) \n",
      "df2.columns = cols\n",
      "\n",
      "# create an index:\n",
      "#  generate a sequence of dates with pandas' data_range() method,\n",
      "#  then assign the index attribute\n",
      "dates = pd.date_range(start='2013-11-24 13:45:27', freq='W', periods=6)\n",
      "df2.index = dates\n",
      "\n",
      "df2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# an aside: inspecting the dates object...\n",
      "print 'what is a date_range object?\\n\\n', dates"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Do some indexing / subsetting..."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# select a row by index label by using .loc \n",
      "df2.loc['2013-12-01 13:45:27']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# select a single element\n",
      "df2.loc['2013-12-22 13:45:27','c']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# select a set of rows --- n.b.: this is broken, possibly a bug with DateTimeIndex labels?\n",
      "df2.loc[['2013-12-01 13:45:27','2013-12-08 13:45:27']]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sadness!!\n",
      "\n",
      "To make up for the sadness of that last cell, here's an example that does work from the docs...."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# new dataframe with random numbers\n",
      "df1 = pd.DataFrame(np.random.randn(6,4), index=list('abcdef'),columns=list('ABCD'))\n",
      "\n",
      "df1"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# address two separate rows, and a range of three columns\n",
      "df1.loc[['d','f'],'A':'C']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
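    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Back to the datetime-indexed DataFrame: one approach that should sidestep the failure above is to convert the label strings to Timestamps with `pd.to_datetime()` before handing them to `.loc`. This is a sketch that assumes every requested label exists in the index; `dt_demo` and `wanted` are stand-in names so the earlier DataFrames stay untouched."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import numpy as np\n",
      "import pandas as pd\n",
      "\n",
      "# hypothetical stand-in for the datetime-indexed frame: six weekly rows, four random columns\n",
      "dt_index = pd.date_range(start='2013-11-24 13:45:27', freq='W', periods=6)\n",
      "dt_demo = pd.DataFrame(np.random.randn(6, 4), index=dt_index, columns=list('abcd'))\n",
      "\n",
      "# convert the label strings to Timestamps, then select both rows at once\n",
      "wanted = pd.to_datetime(['2013-12-01 13:45:27', '2013-12-08 13:45:27'])\n",
      "dt_demo.loc[wanted]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },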
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "DataFrames have a `melt` method that behaves very similarly to R's `reshape2` package. To convert a 'long' DataFrame back into a 'wide' one, use the `pivot` method."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# .core.* apparently isn't imported with the general 'import pandas'\n",
      "from pandas.core.reshape import melt"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# recall the df we're using\n",
      "df2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# melt and keep 'a' and 'b' columns as id variables (like R)\n",
      "df3 = melt(df2, id_vars=['a','b'])\n",
      "\n",
      "df3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Note that the melted DataFrame lost its indexes. If that index was a measurement or otherwise useful parameter, the approach would be to create a new column with the appropriate values in it, then apply the `melt` - that column could become a variable just as `c` and `d` in the example."
     ]
    },
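    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Here is a minimal sketch of that approach, plus a `pivot` back to wide form. The frame `wide`, the column name `when`, and the other names are made up for illustration."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import numpy as np\n",
      "import pandas as pd\n",
      "\n",
      "# hypothetical small frame with a date index\n",
      "wide = pd.DataFrame(np.random.randn(3, 4),\n",
      "                    index=pd.date_range('2013-11-24', periods=3, freq='W'),\n",
      "                    columns=list('abcd'))\n",
      "\n",
      "# copy the index into an ordinary column so melt can carry it along\n",
      "wide_with_col = wide.copy()\n",
      "wide_with_col['when'] = wide_with_col.index\n",
      "\n",
      "# 'when', 'a' and 'b' stay as id variables; 'c' and 'd' are melted into variable/value pairs\n",
      "long_df = pd.melt(wide_with_col, id_vars=['when', 'a', 'b'])\n",
      "\n",
      "# pivot the long frame back to wide: one row per date, one column per melted variable\n",
      "long_df.pivot(index='when', columns='variable', values='value')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },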
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## part 2: data\n",
      "\n",
      "In the `data/` directory is the sample of parsed twitter data that floats around with gnacs. To create the string of column names, I just used the explain option with all other options."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "gnacs_x = \"id|postedTime|body|None|['twitter_entiteis:urls:url']|['None']|['actor:languages_list-items']|gnip:language:value|twitter_lang|[u'geo:coordinates_list-items']|geo:type|None|None|None|None|actor:utcOffset|None|None|None|None|None|None|None|None|None|actor:displayName|actor:preferredUsername|actor:id|gnip:klout_score|actor:followersCount|actor:friendsCount|actor:listedCount|actor:statusesCount|Tweet|None|None|None\"\n",
      "colnames = gnacs_x.split('|')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# prevent the automatic compression of wide dataframes (add scroll bar)\n",
      "pd.set_option(\"display.max_columns\", None)\n",
      "\n",
      "# get some data, inspect\n",
      "df1 = pd.read_csv('data/twitter_sample.csv', sep='|', names=colnames)\n",
      "\n",
      "df1.tail(7)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Since there are so many explain fields that come back with 'None', let's just get rid of them for now. \n",
      "\n",
      "(In the future, we might try to find a way to make that field more descriptive, too.)"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# n.b.: this is an *in-place* delete -- unusual for a pandas structure\n",
      "del df1['None']\n",
      "\n",
      "# The command below is how the docs suggest carrying this out (creating a new df). \n",
      "#   But, it doesn't seem to work -- possibly due to multiple cols with same name. Oh well. \n",
      "#new_df = df1.drop('None', axis=1)  # return new df"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# have a peek\n",
      "df1.tail(6)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## slicing & combining\n",
      "\n",
      "Subsetting a DataFrame is very similar to the syntax in R. There are two ways to select columns: 'dot' (attribute) notation, and 'square bracket' (index) notation. Sometimes, the column names will dictate which you have to use."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# inspect those rows with twitter-classified lang 'en' (scroll the right to see)\n",
      "df1[df1.twitter_lang == 'en'].head()\n",
      "\n",
      "# the colons in the column name below won't allow dot-access to the column, so we can quote them and still filter.\n",
      "#df1[df1[\"gnip:language:value\"] == 'en'].head()  "
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's get a subset of this dataframe that has numerical values so we can eventually do some stuff."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# create new dataframe from numerical columns\n",
      "df2 = df1[[\"gnip:klout_score\",\"actor:followersCount\", \"actor:friendsCount\", \"actor:listedCount\"]]\n",
      "\n",
      "df2.head()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# because I happen to know the answer, let's check data types of the columns...\n",
      "df2.dtypes  "
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The `object` type means that the column has multiple types of data in it. This is a good opportunity to 'fix' a section of the DataFrame by way of a function & the `map()` function"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# convert ints / strings to floats, give up on anything else (call it 0.0)\n",
      "def floatify(val):\n",
      "    if val == None or val == 'None':\n",
      "        return 0.0\n",
      "    else:\n",
      "        return float(val)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# assigning to an existing column overwrites that column \n",
      "df2['gnip:klout_score'] = df2['gnip:klout_score'].map(floatify)\n",
      "\n",
      "# check again\n",
      "df2.dtypes"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# use all floats just for fun. \n",
      "#  this only works if the elements can all be converted to floats (e.g. ints or something python can handle) \n",
      "df2 = df2.astype(float)\n",
      "\n",
      "df2.dtypes"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Since they're all numbers now, we can do math and also add new columns to the DataFrame. Combining values from separate columns occurs on a row-by-row basis, as expected."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# look at some activity ratios - add col to df\n",
      "df2['fol/fr'] = df2['gnip:klout_score'] / df2['actor:followersCount']\n",
      "\n",
      "df2.head()\n",
      "\n",
      "# can also use the built-in describe() method to get quick descriptive stats on the dataframe\n",
      "#df2.describe()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## grouping\n",
      "\n",
      "`groupby()` is used for the split-apply-combine process. I'm led to believe that this is one of the stronger aspects of `pandas`' approach to DataFrames (versus R's), but haven't yet had a chance to really see the power."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# back to bigger df, without 'None' cols\n",
      "df1.head()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Use a groupby to collect all rows by language value, and subsequently use some of the methods available to `GroupBy` DataFrames. Note that the `GroupBy` methods will only act on (and the method call only return) values for columns where numerical calculation makes sense."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# subset df, create new df with only 'popular' accounts -- those matching the filter condition given\n",
      "pop_df = df1[df1[\"actor:followersCount\"] >= 100]\n",
      "\n",
      "# fix the klout scores again\n",
      "pop_df['gnip:klout_score'] = pop_df['gnip:klout_score'].map(floatify)\n",
      "\n",
      "# in case you need to remind yourself of the dataframe\n",
      "#pop_df.head()\n",
      "\n",
      "# use GroupBy methods for stats on each group:\n",
      "pop_df.groupby(\"twitter_lang\").size()      # number of elements per group\n",
      "#pop_df.groupby(\"twitter_lang\").sum()       # sum of elements in each group (obviously doesn't make sense for some cols) \n",
      "#pop_df.groupby(\"twitter_lang\").mean()      # algebraic mean of elements per group"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# though this looks like a normal dataframe, the DataFrameGroupBy object has a heirarchical index\n",
      "#  this means it may not act as you might expect.\n",
      "lang_gb = pop_df[['twitter_lang',\\\n",
      "             'gnip:klout_score',\\\n",
      "             'actor:followersCount',\\\n",
      "             'actor:friendsCount',\\\n",
      "             'actor:statusesCount']].groupby('twitter_lang')\n",
      "\n",
      "\n",
      "# note the new index 'twitter_lang' -- in this case, .head(n) returns <= n elements for each index\n",
      "lang_gb.head(2)  \n",
      "\n",
      "# see that they type is DataFrameGroupBy object\n",
      "#lang_gb"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# to get a DataFrame object that responds more like I'm used to, create a new one using the \n",
      "#   aggregate method, which results in a single-index DataFrame\n",
      "lang_gb_mean = lang_gb.aggregate(np.mean)  \n",
      "\n",
      "lang_gb_mean.head()\n",
      "\n",
      "# verify the single index\n",
      "#lang_gb_mean.index"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
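    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "For a small taste of what split-apply-combine can do, here is a sketch on made-up toy data (the frame `toy` and its values are hypothetical): passing a dict to `agg()` applies a different function to each column within every group."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import pandas as pd\n",
      "\n",
      "# made-up toy data: a language code plus two numeric columns\n",
      "toy = pd.DataFrame({'lang': ['en', 'en', 'es', 'es', 'ja'],\n",
      "                    'followers': [120, 300, 80, 150, 500],\n",
      "                    'friends': [100, 200, 90, 60, 250]})\n",
      "\n",
      "# split by 'lang', apply a different aggregation to each column, combine into one frame\n",
      "toy.groupby('lang').agg({'followers': 'mean', 'friends': 'max'})"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },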
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## part 3: plotting\n",
      "\n",
      "As far as I can tell, plotting in Python was not fun in the past. Below is some easy, base `matplotlib`, but 'nice' graphics take *a lot* of code. This situation is changing quite quickly now, with the success of `ggplot2` in the R world and the attempts to a) make `matplotlib` look less sucky, and b) implement the Grammar of Graphics in Python."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# .plot() is a pandas wrapper for matplotlib's plt.plot() \n",
      "lang_gb_mean['actor:followersCount'].plot(kind='bar', color='g')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# more base matplotlib \n",
      "plt.scatter(x=lang_gb_mean['actor:followersCount'],\\\n",
      "            y=lang_gb_mean['actor:friendsCount'],\\\n",
      "            alpha=0.5,\\\n",
      "            s=50,\\\n",
      "            color='red',\\\n",
      "            marker='o')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# now read the docs and copypasta a neat-looking plot\n",
      "from pandas.tools.plotting import scatter_matrix\n",
      "\n",
      "scatter_matrix(lang_gb_mean, alpha=0.5, figsize=(12,12), diagonal='kde', s=100)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Finally, a short taste of some other plotting libraries. My munging + plotting skillz in this world are still a work in progress, so I will definitely return to this section with an actual use-case in the future. For now, we'll make up some data for illustrative purposes."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# make up some data with large-scale patterns and a datetime index\n",
      "df = pd.DataFrame(np.random.randn(1000, 4), index=pd.date_range('1/1/2000', periods=1000), columns=list('ABCD'))\n",
      "df = df.cumsum()\n",
      "df.head()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df.plot()\n",
      "df.hist()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Now, let's use some other matplotlib wrappers to get a sense of how we can make this look a little better....\n",
      "\n",
      "- `prettyplotlib` essentially just fixes a bunch of matplotlib settings behind the scenes so your base methods lead to e.g. ColorBrewer palettes. This essentially overwrites the .matplotlibrc settings for this session with palettes and settings that are slightly nicer. The settings will remain until you start a new session."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import prettyplotlib\n",
      "\n",
      "df.plot()\n",
      "df.hist()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "- `ggplot` is a very new port of R's ggplot2 into Python and is in [very active development](https://github.com/yhat/ggplot/blob/master/TODO.md). Much of the really powerful aspects of R's ggplot is yet to be implemented, but even [the man himself](https://github.com/hadley/ggplot) is involved in the development, so be sure to check it out later down the road."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df['date'] = pd.date_range('1/1/2000', periods=1000)\n",
      "df.head()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from ggplot import *\n",
      "df_lng = pd.melt(df, id_vars='date')\n",
      "\n",
      "df_lng.head()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ggplot(df_lng, aes(x='date', y='value', color='variable')) + \\\n",
      "    geom_line() + \\\n",
      "    stat_smooth(color='red')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ggplot(df_lng, aes(x='value', color='variable', alpha=1/2.)) + \\\n",
      "    geom_histogram(binwidth=0.1) + \\\n",
      "    ggtitle(\"Holy histogram!\") + labs(\"Count\", \"Frequency\") + \\\n",
      "    ylim(0, 15)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}